AITopics

Country: Asia > China (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Neural Information Processing Systems, Feb-8-2026, 14:27:06 GMT

GRASP: Navigating Retrosynthetic Planning with Goal-driven Policy

Retrosynthetic planning, which aims to find synthetic pathways for a target molecule through a sequential decision-making process over a set of feasible reactions, occupies a crucial position in synthetic chemistry and, accordingly, drug discovery.

artificial intelligence, machine learning, reinforcement learning, (20 more...)

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > China > Zhejiang Province (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Industry: Materials > Chemicals (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)

Neural Information Processing Systems, Feb-7-2026, 13:14:49 GMT

12ffb0968f2f56e51a59a6beb37b2859-AuthorFeedback.pdf

experiment, occupancy, prediction, (12 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.32)

Wang, Mianchu, Montana, Giovanni

Retrosynthesis Planning via Worst-path Policy Optimisation in Tree-structured MDPs

arXiv.org Artificial Intelligence, Nov-19-2025

Retrosynthesis planning aims to decompose target molecules into available building blocks, forming a synthetic tree where each internal node represents an intermediate compound and each leaf ideally corresponds to a purchasable reactant. However, this tree becomes invalid if any leaf node is not a valid building block, making the planning process vulnerable to the "weakest link" in the synthetic route. Existing methods often optimise for average performance across branches, failing to account for this worst-case sensitivity. In this paper, we reframe retrosynthesis as a worst-path optimisation problem within tree-structured Markov Decision Processes (MDPs). We prove that this formulation admits a unique optimal solution and provides monotonic improvement guarantees. Building on this insight, we introduce Interactive Retrosynthesis Planning (InterRetro), a method that interacts with the tree MDP, learns a value function for worst-path outcomes, and improves its policy through self-imitation, preferentially reinforcing past decisions with high estimated advantage. Empirically, InterRetro achieves state-of-the-art results - solving 100% of targets on the Retro*-190 benchmark, shortening synthetic routes by 4.9%, and achieving promising performance using only 10% of the training data.
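The "weakest link" idea in the abstract above can be illustrated with a small sketch. This is not the InterRetro implementation: the `Node` class, the example molecules, and the reward rule (a purchasable leaf at depth d earns `gamma**d`, any other leaf earns 0) are assumptions chosen only to show how a worst-path score differs from an average over branches.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A molecule in a synthetic tree (hypothetical structure, for illustration)."""
    name: str
    purchasable: bool = False
    children: List["Node"] = field(default_factory=list)


def leaf_rewards(node: Node, gamma: float = 0.9, depth: int = 0) -> List[float]:
    """Collect one reward per root-to-leaf path.

    Assumed reward rule (not taken from the paper): a purchasable leaf at
    depth d earns gamma**d, a non-purchasable leaf earns 0.0.
    """
    if not node.children:
        return [gamma ** depth if node.purchasable else 0.0]
    rewards: List[float] = []
    for child in node.children:
        rewards.extend(leaf_rewards(child, gamma, depth + 1))
    return rewards


def worst_path_value(root: Node, gamma: float = 0.9) -> float:
    """Score a tree by its worst path: one bad leaf zeroes the whole tree."""
    return min(leaf_rewards(root, gamma))


def mean_leaf_value(root: Node, gamma: float = 0.9) -> float:
    """Score a tree by the average over paths, which can hide a bad leaf."""
    rewards = leaf_rewards(root, gamma)
    return sum(rewards) / len(rewards)


def make_tree(d_purchasable: bool) -> Node:
    """Build target -> {A, B}, B -> {C, D}; only D's availability varies."""
    c = Node("C", purchasable=True)
    d = Node("D", purchasable=d_purchasable)
    a = Node("A", purchasable=True)
    b = Node("B", children=[c, d])
    return Node("target", children=[a, b])


broken = make_tree(d_purchasable=False)  # one leaf is not a building block
valid = make_tree(d_purchasable=True)    # every leaf is purchasable

print("broken tree: worst =", worst_path_value(broken), "mean =", mean_leaf_value(broken))
print("valid tree:  worst =", worst_path_value(valid), "mean =", mean_leaf_value(valid))
```

In the broken tree the average stays well above zero even though the route cannot be executed, while the worst-path score drops to zero. That gap is the sensitivity the abstract says average-based objectives miss.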

artificial intelligence, machine learning, molecule, (18 more...)

2509.10504

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Neural Information Processing Systems, Oct-2-2025, 03:32:22 GMT

12ffb0968f2f56e51a59a6beb37b2859-Paper.pdf

The choice of the model's prediction horizon constitutes

machine learning, neural information processing system, reinforcement learning, (11 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Neural Information Processing Systems, Oct-2-2025, 03:32:11 GMT

12ffb0968f2f56e51a59a6beb37b2859-AuthorFeedback.pdf

We thank the reviewers for their insights and suggestions. Answers below will be included in expanded discussions in future versions of the paper. In the case of R3's car example, as long as states from 10 steps into the future are sampled This is discussed in L211-L215 in Section 6 "Practical Training of γ-Models". The only Monte Carlo trajectory estimates are in the final column for comparison.

experiment, machine learning, reinforcement learning, (14 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.32)

Neural Information Processing Systems, Aug-14-2025, 11:09:20 GMT

42beaab8aa8da1c77581609a61eced93-Paper-Conference.pdf

molecule, pathway, reaction, (16 more...)

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > China > Zhejiang Province (0.04)
Asia > China > Hong Kong (0.04)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.68)
Information Technology > Artificial Intelligence > Natural Language (0.68)

arXiv.org Artificial Intelligence, Sep-6-2024

A high-accuracy multi-model mixing retrosynthetic method

Xiang, Shang, Yao, Lin, Wang, Zhen, Yu, Qifan, Liu, Wentan, Guo, Wentao, Ke, Guolin

The field of computer-aided synthesis planning (CASP) has seen rapid advancements in recent years, achieving significant progress across various algorithmic benchmarks. However, chemists often encounter numerous infeasible reactions when using CASP in practice. This article delves into common errors associated with CASP and introduces a product prediction model aimed at enhancing the accuracy of single-step models. While the product prediction model reduces the number of single-step reactions, it integrates multiple single-step models to maintain the overall reaction count and increase reaction diversity. Based on manual analysis and large-scale testing, the product prediction model, combined with the multi-model ensemble approach, has been proven to offer higher feasibility and greater diversity.

molecule, reaction, template, (16 more...)

2409.04335

Country:

North America > United States (0.14)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.69)

arXiv.org Artificial Intelligence, Jun-9-2024

Long-Horizon Rollout via Dynamics Diffusion for Offline Reinforcement Learning

Zhao, Hanye, Han, Xiaoshen, Zhu, Zhengbang, Liu, Minghuan, Yu, Yong, Zhang, Weinan

With the great success of diffusion models (DMs) in generating realistic synthetic vision data, many researchers have investigated their potential in decision-making and control. Most of these works utilized DMs to sample directly from the trajectory space, where DMs can be viewed as a combination of dynamics models and policies. In this work, we explore how to decouple DMs' ability as dynamics models in fully offline settings, allowing the learning policy to roll out trajectories. As DMs learn the data distribution from the dataset, their intrinsic policy is actually the behavior policy induced from the dataset, which results in a mismatch between the behavior policy and the learning policy. We propose Dynamics Diffusion, DyDiff for short, which can inject information from the learning policy into DMs iteratively. DyDiff ensures long-horizon rollout accuracy while maintaining policy consistency and can be easily deployed on model-free algorithms. We provide theoretical analysis to show the advantage of DMs on long-horizon rollout over models and demonstrate the effectiveness of DyDiff in the context of offline reinforcement learning, where a rollout dataset is provided but no online environment is available for interaction. Our code is at https://github.com/FineArtz/DyDiff.

arxiv preprint arxiv, dydiff, trajectory, (14 more...)

2405.19189

Country: Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

arXiv.org Artificial Intelligence, Oct-8-2023

Evolutionary Retrosynthetic Route Planning

Zhang, Yan, Hao, Hao, He, Xiao, Gao, Shuanhu, Zhou, Aimin

Molecular retrosynthesis is a significant and complex problem in chemistry. However, traditional manual synthesis methods not only require well-trained experts but are also time-consuming. With the development of big data and machine learning, artificial intelligence (AI) based retrosynthesis is attracting more attention and is becoming a valuable tool for molecular retrosynthesis. At present, Monte Carlo tree search is a mainstream search framework employed to address this problem. Nevertheless, its search efficiency is compromised by its large search space. Therefore, we propose a novel approach for retrosynthetic route planning based on evolutionary optimization, marking the first use of an Evolutionary Algorithm (EA) in the field of multi-step retrosynthesis. The proposed method involves modeling the retrosynthetic problem as an optimization problem and defining the search space and operators. Additionally, to improve the search efficiency, a parallel strategy is implemented. The new approach is applied to four case products and is compared with Monte Carlo tree search. The experimental results show that, compared with the Monte Carlo tree search algorithm, EA significantly reduces the number of calls to the single-step model, by an average of 53.9%. The time required to find three solutions decreases by an average of 83.9%, and the number of feasible search routes increases by a factor of 5.
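For readers unfamiliar with evolutionary optimization, the generic loop behind such methods (selection, mutation, survival of the best individuals) can be sketched as follows. This is a toy under stated assumptions, not the paper's algorithm: the route encoding as a list of reaction-choice indices, the `toy_fitness` objective, and all parameter values are hypothetical, and no single-step retrosynthesis model is called.

```python
import random
from typing import List, Tuple

# Hypothetical encoding: one reaction-choice index per step of a route.
Route = List[int]


def toy_fitness(route: Route) -> int:
    """Toy stand-in for a route score (NOT the paper's objective).

    It simply counts how many steps picked choice 0.
    """
    return sum(1 for gene in route if gene == 0)


def mutate(route: Route, n_choices: int, rate: float, rng: random.Random) -> Route:
    """Resample each step's choice independently with probability `rate`."""
    return [rng.randrange(n_choices) if rng.random() < rate else gene
            for gene in route]


def evolve(pop_size: int = 20, length: int = 6, n_choices: int = 5,
           generations: int = 30, rate: float = 0.2,
           seed: int = 0) -> Tuple[Route, List[int]]:
    """Run a minimal evolutionary loop and return the best route and
    the best fitness recorded at each generation."""
    rng = random.Random(seed)
    population = [[rng.randrange(n_choices) for _ in range(length)]
                  for _ in range(pop_size)]
    history: List[int] = []
    for _ in range(generations):
        population.sort(key=toy_fitness, reverse=True)
        history.append(toy_fitness(population[0]))
        # Truncation selection: the top half become parents.
        parents = population[: pop_size // 2]
        children = [mutate(rng.choice(parents), n_choices, rate, rng)
                    for _ in range(pop_size - len(parents))]
        # Parents survive unchanged, so the best score never decreases.
        population = parents + children
    population.sort(key=toy_fitness, reverse=True)
    history.append(toy_fitness(population[0]))
    return population[0], history


best_route, best_history = evolve()
print("best route:", best_route, "fitness:", toy_fitness(best_route))
```

In a real planner the fitness evaluation would be the expensive part, since it would involve calls to a single-step model. That is why the abstract reports the number of such calls as its main efficiency measure.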

algorithm, retrosynthesis, search space, (15 more...)